#Probabilistic Causation
Text
The Philosophy of Causation
The philosophy of causation delves into the nature of the relationship between cause and effect. This area of philosophy seeks to understand how and why certain events lead to particular outcomes and explores various theories and concepts related to causality. The study of causation is fundamental to numerous fields, including science, metaphysics, and everyday reasoning.
Key Concepts in the Philosophy of Causation
Causal Determinism:
Concept: The idea that every event is necessitated by antecedent events and conditions together with the laws of nature.
Argument: If causal determinism is true, then every event or state of affairs, including human actions, is the result of preceding events in accordance with universal laws.
Causal Relations and Counterfactuals:
Concept: Causal relations can often be understood in terms of counterfactual dependence: if event A had not occurred, event B would not have occurred either.
Argument: David Lewis’s counterfactual theory of causation emphasizes the importance of counterfactuals (what would have happened if things had been different) in understanding causation.
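In the notation standardly used for Lewis-style accounts (where □→ is the counterfactual conditional "if ... had been the case, ... would have been the case"), the core dependence relation can be sketched as:

```latex
% Requires amssymb for \Box. E counterfactually depends on C when both conditionals hold;
% for actual, distinct events Lewis takes such dependence to suffice for causation.
(C \mathrel{\Box\!\!\rightarrow} E) \;\wedge\; (\neg C \mathrel{\Box\!\!\rightarrow} \neg E)
```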
Humean Regularity Theory:
Concept: David Hume proposed that causation is nothing more than the regular succession of events; if A is regularly followed by B, we consider A to be the cause of B.
Argument: This theory suggests that causation is about patterns of events rather than any necessary connection between them.
Mechanistic Theories:
Concept: These theories emphasize the importance of mechanisms—specific processes or systems of parts that produce certain effects.
Argument: Understanding the mechanisms underlying causal relationships is crucial for explaining how causes bring about their effects.
Probabilistic Causation:
Concept: This approach deals with causes that increase the likelihood of their effects rather than deterministically bringing them about.
Argument: Probabilistic causation is essential for understanding phenomena in fields like quantum mechanics and statistics, where outcomes are not strictly determined.
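The probability-raising idea at the heart of this approach is often summarized by a simple inequality, with C the candidate cause, E the effect, and K a fixed background context standing in for the "all else being equal" clause:

```latex
P(E \mid C, K) > P(E \mid \neg C, K)
```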
Agent Causation:
Concept: This theory posits that agents (typically human beings) can initiate causal chains through their actions.
Argument: Unlike event causation, where events cause other events, agent causation places the source of causal power in agents themselves, which is significant for discussions of free will and moral responsibility.
Causal Pluralism:
Concept: The view that there are multiple legitimate ways to understand and analyze causation, depending on the context.
Argument: Causal pluralism suggests that different scientific, philosophical, and everyday contexts may require different accounts of causation.
Theoretical Perspectives on Causation
Humean vs. Non-Humean Causation:
Humean: Emphasizes regularity and contiguity in space and time between causes and effects, rejecting the notion of necessary connections.
Non-Humean: Asserts that there are genuine necessary connections in nature that underpin causal relationships.
Reductionism vs. Non-Reductionism:
Reductionism: Seeks to explain causation in terms of more fundamental phenomena, such as laws of nature or physical processes.
Non-Reductionism: Holds that causal relations are fundamental and cannot be fully explained by reducing them to other phenomena.
Causal Realism vs. Causal Anti-Realism:
Causal Realism: The belief that causal relations are objective features of the world.
Causal Anti-Realism: The belief that causal relations are not objective features of the world but rather constructs or useful fictions.
Temporal Asymmetry of Causation:
Concept: Causation is often thought to have a temporal direction, with causes preceding their effects.
Argument: Philosophers debate whether this asymmetry is a fundamental feature of reality or a result of our psychological or epistemic limitations.
The philosophy of causation is a rich and complex field that addresses fundamental questions about how and why events occur. From the deterministic framework of classical mechanics to the probabilistic nature of quantum mechanics, and from the regularity theory of Hume to contemporary mechanistic approaches, causation remains a central topic in understanding the structure of reality.
#philosophy#epistemology#knowledge#learning#education#chatgpt#ontology#metaphysics#Causation#Determinism#Counterfactuals#Humean Theory#Mechanistic Theories#Probabilistic Causation#Agent Causation#Causal Pluralism#Temporal Asymmetry
0 notes
Text
Legal scholars claiming causation when all they have is weak ass correlation with massive signs of it being spurious, endogenous or caused by a missing variable is what will give me an ulcer.
Or a very complex and nuanced villain origin story, as the jurist who turned against her own peers and companions, screaming "YOU ARE THE ONE MAKING ME DO THIS" to them, as she puts on the dark cloak of political science, now forever tormented by her torn identity and a massive imposter syndrome.
#adventures in academia#I want to like this book#I like a lot of things about this book#but you cannot give me one (one) table of descriptive statistics#and tell me this is proof of causation#also I will start ranting about how qualitative methods are based on set theory and deterministic causality#so using stats does not wooooooork because they have a probabilistic approach to causality#STOP PLEASE
13 notes
Text
How Causal AI Is Transforming Industrial Digitization: Insights From Stuart Frost
In today’s fast-evolving industrial landscape, the drive for efficient, data-driven decision-making has never been more crucial. Enter Causal AI, a revolutionary force reshaping how industries embrace digitization.
At its core, causal AI enables companies to understand not just correlations but the actual cause-and-effect relationships within their data. This deeper insight empowers businesses to make more informed decisions, optimize processes, and predict outcomes with unprecedented accuracy. Stuart Frost, CEO of causal AI company Geminos, explores how causal AI will benefit industries as they compete in an increasingly data-centric world.
Understanding Causal AI
Causal AI is quickly becoming a cornerstone in the way industries approach data analysis. While traditional data analytics often focus on identifying patterns and correlations, Causal AI digs deeper, aiming to uncover the root causes behind these patterns. This understanding enables industries to make smarter decisions, pinpointing the true drivers of performance and change. But to grasp what makes Causal AI so revolutionary, it’s essential to differentiate between mere correlations and genuine causation and to explore the mechanisms that enable Causal AI to function effectively.
In the industrial sector, the distinction between causation and correlation is critical. Correlation indicates a relationship between two variables, meaning they often move together. However, correlation does not imply that one variable causes the other to change. This is where many businesses fall into traps; they make decisions based on assumed causes, only to find that they’ve addressed symptoms rather than root problems.
Causal AI helps identify these cause-and-effect relationships, going beyond the surface to drive more precise and effective industrial strategies. It’s like having a map that shows not just the roads but also which ones actually lead to your destination.
“Causal AI employs several methodologies to identify and analyze causal relationships,” says Stuart Frost. “One of the primary methods is causal inference, which uses statistical models to determine cause-and-effect links. This method goes beyond traditional statistical techniques by focusing on how variables interact in their natural settings.”
Graphical models called DAGs (Directed Acyclic Graphs) are a cornerstone of Causal AI. They represent the probabilistic relationships among variables. These models help in mapping out potential scenarios and understanding how changes in one variable affect others, and they are a great communication tool for business analysts, data scientists, and subject matter experts. Then there's structural equation modeling, which combines statistical data with causal assumptions to model complex relationships. This approach allows industries to build comprehensive models that reflect real-world complexities.
Together, these methods equip industries with tools to not only identify causation but also to simulate the outcomes of various decisions, leading to optimized processes and forward-thinking strategies.
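As an illustration of what "simulating the outcomes of various decisions" can look like, here is a minimal Python sketch of a toy structural causal model. The variables, coefficients, and maintenance scenario are invented for this example (they are not Geminos' actual models); the point is the contrast between a naive observational comparison and a simulated hard intervention.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def simulate(do_high_load=None):
    """Toy structural causal model:
    machine_age -> load (observationally), machine_age -> vibration,
    machine_age -> failure, load -> vibration, vibration -> failure.
    Passing do_high_load forces 'load' to a fixed value (a hard intervention),
    cutting the arrow from machine_age into load."""
    machine_age = rng.uniform(0, 10, n)                        # years in service
    if do_high_load is None:
        p_load = 1 / (1 + np.exp(-(machine_age - 5)))          # older machines run under heavier load
        load = rng.binomial(1, p_load)
    else:
        load = np.full(n, do_high_load)
    vibration = 0.5 * machine_age + 2.0 * load + rng.normal(0, 1, n)
    p_fail = 1 / (1 + np.exp(-(0.4 * vibration + 0.3 * machine_age - 5)))
    failure = rng.binomial(1, p_fail)
    return load, failure

# Observational comparison: confounded by machine age.
load, failure = simulate()
obs_diff = failure[load == 1].mean() - failure[load == 0].mean()

# Interventional comparison: force the load for everyone, then compare.
_, fail_hi = simulate(do_high_load=1)
_, fail_lo = simulate(do_high_load=0)
causal_diff = fail_hi.mean() - fail_lo.mean()

print(f"observational difference in failure rate: {obs_diff:.3f}")
print(f"interventional (causal) difference:       {causal_diff:.3f}")
```

In this toy world the observational gap overstates the causal effect of load, because machine age drives both load and failure; comparing simulated interventions isolates the contribution of load itself, which is the kind of question a causal model is built to answer.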
Impact of Causal AI on Industrial Processes
Causal AI is driving a paradigm shift in industrial processes, empowering businesses with actionable insights that direct their operational strategies. As organizations strive to enhance efficiency and effectiveness, Causal AI stands out with its focus on cause-and-effect, enabling industries to not just react to changes but anticipate and influence them. Let’s explore how it’s reshaping key areas like maintenance, supply chain, and quality control.
Notes Frost, “Imagine a factory floor where machines can predict when they need repairs. Causal AI makes this possible by offering insights that conventional analytics can miss.”
By understanding the causal links between machine usage and potential failures, industries can switch from reactive maintenance to predictive strategies. This ability to foresee and address issues before they escalate reduces costly downtime and boosts overall efficiency. Maintenance schedules become more dynamic, adapting to real-time conditions rather than routine checks, ensuring machinery operates at peak performance with minimal interruptions.
The supply chain is the heartbeat of manufacturing operations, yet it is often vulnerable to disruptions. Causal AI helps untangle the complexities of supply chain dynamics by pinpointing the causal factors that influence production and logistics. It sifts through vast datasets to identify hidden patterns, providing businesses with a blueprint for optimizing their supply chains.
Maintaining high-quality standards is crucial in any industry. Causal AI strengthens quality control by moving beyond superficial data patterns to reveal the underlying causes of defects. By identifying and addressing these root causes, businesses can implement improvements that prevent recurring issues. This proactive approach not only enhances product quality but also reduces wastage and recalls, leading to substantial cost savings. It is like having a digital detective on hand, ready to solve the mystery of defects before they affect the final product.
Challenges and Considerations
As industries embrace Causal AI to drive digitization, they face numerous challenges. From overcoming data obstacles to navigating organizational culture shifts, understanding and addressing these issues is crucial. This section explores these key challenges and offers insights into tackling them effectively.
“Handling data in the context of Causal AI is no small task. Data must be high-quality, unbiased, and free from noise to produce accurate outcomes,” says Frost.
Noise in data can act like static on a radio, interfering with the clear signal you’re trying to capture, which in AI terms translates to misleading insights. To combat this, industries are employing rigorous data-cleaning methods. Pre-processing data with tools that detect and filter out noise ensures that only relevant, clean data is used in modeling.
Bias in data is another formidable hurdle. Bias can skew results and lead to faulty conclusions, much like a biased umpire skewing the outcome of a game. To mitigate this, Causal AI’s graphical models can be used to identify and eliminate potential sources of bias.
Integrating Causal AI into existing frameworks requires more than just technical adjustments. It demands a cultural shift within organizations. Resistance to change is common, much like how a ship resists a change in course despite needing to head in a new direction. Overcoming this inertia requires strong leadership and a clear vision of the potential of Causal AI.
Education is a key component in driving this cultural change. By investing in training and development, organizations can build a workforce well-versed in AI technologies. When employees understand the benefits and workings of Causal AI, they are more likely to embrace it. Additionally, creating cross-functional teams encourages collaboration, fostering a shared sense of purpose and breaking down silos that might resist new technology.
Organizational structures may also need to evolve. Decision-making can no longer rely solely on intuition but should be data-driven. This shift can be likened to a transition from gut-feel navigation to compass-guided travel. Companies that adapt by fostering a culture of data-driven decision-making often find themselves more agile and competitive.
Causal AI is revolutionizing industrial digitization, offering a profound shift in how industries operate. By unveiling the cause-and-effect dynamics entrenched in vast datasets, it facilitates intelligent decision-making and strategic planning. This technology pushes past traditional analytics, allowing industries not only to react but to anticipate changes, streamlining processes across maintenance, supply chain, and quality control.
As the industrial landscape continues to evolve, the integration of Causal AI with emerging technologies like IoT and big data remains crucial. This convergence promises enhanced operational efficiencies and innovative pathways. Industries that embrace Causal AI now secure a competitive edge, paving the way for future advancements.
Originally Published At: https://techbullion.com/ On November 26, 2024
1 note
Text
youtube
Diaphragmatic Breathing Morphic Field - 16, 5.35 & 2.5Hz Isochronic Tones - Diaphragm Chakra Healing
For More Powerful and Exclusive Morphic Fields, You can join me on Patreon: -https://www.patreon.com/flicknova
To Purchase Image Morphic Fields (Energetically Programmed Image): -https://www.patreon.com/flicknova/shop
Note:- This Morphic Field is safe from all types of Food Intolerances and Allergies.
Note:- This Morphic Field can be played muted (Effect will be between 90% to 100%)
Morphic Fields:
A morphic field (a term introduced by Rupert Sheldrake, the major proponent of this concept, through his Hypothesis of Formative Causation) is described as consisting of patterns that govern the development of forms, structures and arrangements. The theory of morphic fields is not accepted by mainstream science.
Morphic fields are defined as the universal database for both organic (living) and abstract (mental) forms, while morphogenetic fields are defined by Sheldrake as the subset of morphic fields which influence, and are influenced by living things (the term morphogenetic fields was already in use in environmental biology in the 1920's, having been used in unrelated research of three biologists - Hans Spemann, Alexander Gurwitsch and Paul Weiss).
“The term [morphic field] is more general in its meaning than morphogenetic fields, and includes other kinds of organizing fields in addition to those of morphogenesis; the organizing fields of animal and human behaviour, of social and cultural systems, and of mental activity can all be regarded as morphic fields which contain an inherent memory.” - Sheldrake, The Presence of the Past (Chapter 6, page 112)
References:- Sheldrake, Rupert (1995). Nature As Alive: Morphic Resonance and Collective Memory. Source: [1] (Accessed: Thursday, 1 March 2007)
Morphic Fields Summary:
The hypothesized properties of morphic fields at all levels of complexity can be summarized as follows:
They are self-organizing wholes.
They have both a spatial and a temporal aspect, and organize spatio-temporal patterns of vibratory or rhythmic activity.
They attract the systems under their influence towards characteristic forms and patterns of activity, whose coming-into-being they organize and whose integrity they maintain. The ends or goals towards which morphic fields attract the systems under their influence are called attractors. The pathways by which systems usually reach these attractors are called chreodes.
They interrelate and co-ordinate the morphic units or holons that lie within them, which in turn are wholes organized by morphic fields. Morphic fields contain other morphic fields within them in a nested hierarchy or holarchy.
They are structures of probability, and their organizing activity is probabilistic.
They contain a built-in memory given by self-resonance with a morphic unit's own past and by morphic resonance with all previous similar systems. This memory is cumulative. The more often particular patterns of activity are repeated, the more habitual they tend to become.
Feel free to listen to it on loop; you can sleep with it playing.
Do not listen to it while you are driving or operating machinery. Only listen when you are ready for it.
LEGAL DISCLAIMER: Please be aware that the information provided to you is not evaluated or endorsed by the FDA, and it should not be used as a means of diagnosis or medical advice. The statements shared contain information that listeners can choose to use or disregard according to their own judgment. It is strongly recommended that you consult your primary care physician or a reputable hospital for any urgent medical treatment or advice. This program aims to offer general health benefits and falls under the category of class I. It is not intended to govern any medical or biological information as outlined by FDA guidelines. This channel solely serves as a source of information and does not provide diagnostic or treatment solutions for medical conditions.
#morphicfield#diaphragmaticbreathing#bellybreathing#diaphragm#chakra#chakrahealing#chakrahealingmusic#chakrameditation#meditationmusic#studymusic#sleepmusic#focusmusic#deepfocus#deepfocusmusic#deepfocusstydy#deepconcentration#asmr#asmrsound#asmrsounds#asmrcommunity#asmrrelax#asmrrelaxing#asmrrelaxation#asmrmusic#asmrsleep#asmrdeepsleep#asmrsleepnotalking#asmrsleepy#asmrsleepyhead#solfeggiofrequency
1 note
Text
The Boris Johnson Government is hiring
*This is ranking about a 9.1 on the fubarometer.
https://dominiccummings.com/2020/01/02/two-hands-are-a-lot-were-hiring-data-scientists-project-managers-policy-experts-assorted-weirdos/
JANUARY 2, 2020
BY
DOMINIC CUMMINGS
‘Two hands are a lot’ — we’re hiring data scientists, project managers, policy experts, assorted weirdos…
‘This is possibly the single largest design flaw contributing to the bad Nash equilibrium in which … many governments are stuck. Every individual high-functioning competent person knows they can’t make much difference by being one more face in that crowd.’ Eliezer Yudkowsky, AI expert, LessWrong etc.
‘[M]uch of our intellectual elite who think they have “the solutions” have actually cut themselves off from understanding the basis for much of the most important human progress.’ Michael Nielsen, physicist and one of the handful of most interesting people I’ve ever talked to.
‘People, ideas, machines — in that order.’ Colonel Boyd.
‘There isn’t one novel thought in all of how Berkshire [Hathaway] is run. It’s all about … exploiting unrecognized simplicities.’ Charlie Munger, Warren Buffett’s partner.
‘Two hands, it isn’t much considering how the world is infinite. Yet, all the same, two hands, they are a lot.’ Alexander Grothendieck, one of the great mathematicians.
*
There are many brilliant people in the civil service and politics. Over the past five months the No10 political team has been lucky to work with some fantastic officials. But there are also some profound problems at the core of how the British state makes decisions. This was seen by pundit-world as a very eccentric view in 2014. It is no longer seen as eccentric. Dealing with these deep problems is supported by many great officials, particularly younger ones, though of course there will naturally be many fears — some reasonable, most unreasonable.
Now there is a confluence of: a) Brexit requires many large changes in policy and in the structure of decision-making, b) some people in government are prepared to take risks to change things a lot, and c) a new government with a significant majority and little need to worry about short-term unpopularity while trying to make rapid progress with long-term problems.
There is a huge amount of low hanging fruit — trillion dollar bills lying on the street — in the intersection of:
the selection, education and training of people for high performance
the frontiers of the science of prediction
data science, AI and cognitive technologies (e.g. Seeing Rooms, ‘authoring tools designed for arguing from evidence’, Tetlock/IARPA prediction tournaments that could easily be extended to consider ‘clusters’ of issues around themes like Brexit to improve policy and project management)
communication (e.g. Cialdini)
decision-making institutions at the apex of government.
We want to hire an unusual set of people with different skills and backgrounds to work in Downing Street with the best officials, some as spads and perhaps some as officials. If you are already an official and you read this blog and think you fit one of these categories, get in touch.
The categories are roughly:
Data scientists and software developers
Economists
Policy experts
Project managers
Communication experts
Junior researchers one of whom will also be my personal assistant
Weirdos and misfits with odd skills
We want to improve performance and make me much less important — and within a year largely redundant. At the moment I have to make decisions well outside what Charlie Munger calls my ‘circle of competence’ and we do not have the sort of expertise supporting the PM and ministers that is needed. This must change fast so we can properly serve the public.
A. Unusual mathematicians, physicists, computer scientists, data scientists
You must have exceptional academic qualifications from one of the world’s best universities or have done something that demonstrates equivalent (or greater) talents and skills. You do not need a PhD — as Alan Kay said, we are also interested in graduate students as ‘world-class researchers who don’t have PhDs yet’.
You should have the following:
PhD or MSc in maths or physics.
Outstanding mathematical skills are essential.
Experience of using analytical languages: e.g. Python, SQL, R.
Familiarity with data tools and technologies such as Postgres, Scikit Learn, NEO4J.
A few examples of papers that you will be considering:
This Nature paper, Early warning signals for critical transitions in a thermoacoustic system, looking at early warning systems in physics that could be applied to other areas from finance to epidemics.
Statistical & ML forecasting methods: Concerns and ways forward, Spyros Makridakis, 2018. This compares statistical and ML methods in a forecasting tournament (won by a hybrid stats/ML approach).
Complex Contagions: A Decade in Review, 2017. This looks at a large number of studies on ‘what goes viral and why?’. A lot of studies in this field are dodgy (bad maths, don’t replicate, etc.); an important question is which ones are worth examining.
Model-Free Prediction of Large Spatiotemporally Chaotic Systems from Data: A Reservoir Computing Approach, 2018. This applies ML to predict chaotic systems.
Scale-free networks are rare, Nature 2019. This looks at the question of how widespread scale-free networks really are and how useful this approach is for making predictions in diverse fields.
On the frequency and severity of interstate wars, 2019. ‘How can it be possible that the frequency and severity of interstate wars are so consistent with a stationary model, despite the enormous changes and obviously non-stationary dynamics in human population, in the number of recognized states, in commerce, communication, public health, and technology, and even in the modes of war itself? The fact that the absolute number and sizes of wars are plausibly stable in the face of these changes is a profound mystery for which we have no explanation.’ Does this claim stack up?
The papers on computational rationality below.
The work of Judea Pearl, the leading scholar of causation who has transformed the field.
You should be able to explain to other mathematicians, physicists and computer scientists the ideas in such papers, discuss what could be useful for our projects, synthesise ideas for other data scientists, and apply them to practical problems. You won’t be expert on the maths used in all these papers but you should be confident that you could study it and understand it.
We will be using machine learning and associated tools so it is important you can program. You do not need software development levels of programming but it would be an advantage.
Those applying must watch Bret Victor’s talks and study Dynamic Land. If this excites you, then apply; if not, then don’t. I and others interviewing will discuss this with anybody who comes for an interview. If you want a sense of the sort of things you’d be working on, then read my previous blog on Seeing Rooms, cognitive technologies etc.
B. Unusual software developers
We are looking for great software developers who would love to work on these ideas, build tools and work with some great people. You should also look at some of Victor’s technical talks on programming languages and the history of computing.
You will be working with data scientists, designers and others.
C. Unusual economists
We are looking to hire some recent graduates in economics. You should a) have an outstanding record at a great university, b) understand conventional economic theories, c) be interested in arguments on the edge of the field — for example, work by physicists on ‘agent-based models’ or by the hedge fund Bridgewater on the failures/limitations of conventional macro theories/prediction, and d) have very strong maths and be interested in working with mathematicians, physicists, and computer scientists.
The ideal candidate might, for example, have a degree in maths and economics, worked at the LHC in one summer, worked with a quant fund another summer, and written software for a YC startup in a third summer!
We’ve found one of these but want at least one more.
The sort of conversation you might have is discussing these two papers in Science (2015): Computational rationality: A converging paradigm for intelligence in brains, minds, and machines, Gershman et al and Economic reasoning and artificial intelligence, Parkes & Wellman.
You will see in these papers an intersection of:
von Neumann’s foundation of game theory and ‘expected utility’,
mainstream economic theories,
modern theories about auctions,
theoretical computer science (including problems like the complexity of probabilistic inference in Bayesian networks, which is in the NP–hard complexity class),
ideas on ‘computational rationality’ and meta-reasoning from AI, cognitive science and so on.
If these sort of things are interesting, then you will find this project interesting.
It’s a bonus if you can code but it isn’t necessary.
D. Great project managers.
If you think you are one of a small group of people in the world who are truly GREAT at project management, then we want to talk to you. Victoria Woodcock ran Vote Leave — she was a truly awesome project manager and without her Cameron would certainly have won. We need people like this who have a 1 in 10,000 or higher level of skill and temperament.
The Oxford Handbook on Megaprojects points out that it is possible to quantify lessons from the failures of projects like high speed rail projects because almost all fail so there is a large enough sample to make statistical comparisons, whereas there can be no statistical analysis of successes because they are so rare.
It is extremely interesting that the lessons of Manhattan (1940s), ICBMs (1950s) and Apollo (1960s) remain absolutely cutting edge because it is so hard to apply them and almost nobody has managed to do it. The Pentagon systematically de-programmed itself from more effective approaches to less effective approaches from the mid-1960s, in the name of ‘efficiency’. Is this just another way of saying that people like General Groves and George Mueller are rarer than Fields Medallists?
Anyway — it is obvious that improving government requires vast improvements in project management. The first project will be improving the people and skills already here.
If you want an example of the sort of people we need to find in Britain, look at this on CC Myers — the legendary builders. SPEED. We urgently need people with these sort of skills and attitude. (If you think you are such a company and you could dual carriageway the A1 north of Newcastle in record time, then get in touch!)
E. Junior researchers
In many aspects of government, as in the tech world and investing, brains and temperament smash experience and seniority out of the park.
We want to hire some VERY clever young people either straight out of university or recently out, with extreme curiosity and capacity for hard work.
One of you will be a sort of personal assistant to me for a year — this will involve a mix of very interesting work and lots of uninteresting trivia that makes my life easier which you won’t enjoy. You will not have weekday date nights, you will sacrifice many weekends — frankly it will be hard having a boy/girlfriend at all. It will be exhausting but interesting and if you cut it you will be involved in things at the age of ~21 that most people never see.
I don’t want confident public school bluffers. I want people who are much brighter than me who can work in an extreme environment. If you play office politics, you will be discovered and immediately binned.
F. Communications
In SW1 communication is generally treated as almost synonymous with ‘talking to the lobby’. This is partly why so much punditry is ‘narrative from noise’.
With no election for years and huge changes in the digital world, there is a chance and a need to do things very differently.
We’re particularly interested in deep experts on TV and digital. We also are interested in people who have worked in movies or on advertising campaigns. There are some very interesting possibilities in the intersection of technology and story telling — if you’ve done something weird, this may be the place for you.
I noticed in the recent campaign that the world of digital advertising has changed very fast since I was last involved in 2016. This is partly why so many journalists wrongly looked at things like Corbyn’s Facebook stats and thought Labour was doing better than us — the ecosystem evolves rapidly while political journalists are still behind the 2016 tech, hence why so many fell for Carole’s conspiracy theories. The digital people involved in the last campaign really knew what they were doing, which is incredibly rare in this world of charlatans and clients who don’t know what they should be buying. If you are interested in being right at the very edge of this field, join.
We have some extremely able people but we also must upgrade skills across the spad network.
G. Policy experts
One of the problems with the civil service is the way in which people are shuffled such that they either do not acquire expertise or they are moved out of areas they really know to do something else. One Friday, X is in charge of special needs education, the next week X is in charge of budgets.
There are, of course, general skills. Managing a large organisation involves some general skills. Whether it is Coca Cola or Apple, some things are very similar — how to deal with people, how to build great teams and so on. Experience is often over-rated. When Warren Buffett needed someone to turn around his insurance business he did not hire someone with experience in insurance: ‘When Ajit entered Berkshire’s office on a Saturday in 1986, he did not have a day’s experience in the insurance business’ (Buffett).
Shuffling some people who are expected to be general managers is a natural thing but it is clear Whitehall does this too much while also not training general management skills properly. There are not enough people with deep expertise in specific fields.
If you want to work in the policy unit or a department and you really know your subject so that you could confidently argue about it with world-class experts, get in touch.
It’s also the case that wherever you are most of the best people are inevitably somewhere else. This means that governments must be much better at tapping distributed expertise. Of the top 20 people in the world who best understand the science of climate change and could advise us what to do with COP 2020, how many now work as a civil servant/spad or will become one in the next 5 years?
G. Super-talented weirdos
People in SW1 talk a lot about ‘diversity’ but they rarely mean ‘true cognitive diversity’. They are usually babbling about ‘gender identity diversity blah blah’. What SW1 needs is not more drivel about ‘identity’ and ‘diversity’ from Oxbridge humanities graduates but more genuine cognitive diversity.
We need some true wild cards, artists, people who never went to university and fought their way out of an appalling hell hole, weirdos from William Gibson novels like that girl hired by Bigend as a brand ‘diviner’ who feels sick at the sight of Tommy Hilfiger or that Chinese-Cuban free runner from a crime family hired by the KGB. If you want to figure out what characters around Putin might do, or how international criminal gangs might exploit holes in our border security, you don’t want more Oxbridge English graduates who chat about Lacan at dinner parties with TV producers and spread fake news about fake news.
By definition I don’t really know what I’m looking for but I want people around No10 to be on the lookout for such people.
We need to figure out how to use such people better without asking them to conform to the horrors of ‘Human Resources’ (which also obviously need a bonfire).
*
Send a max 1 page letter plus CV to [email protected] and put in the subject line ‘job/’ and add after the / one of: data, developer, econ, comms, projects, research, policy, misfit.
I’ll have to spend time helping you so don’t apply unless you can commit to at least 2 years.
I’ll bin you within weeks if you don’t fit — don’t complain later because I made it clear now.
I will try to answer as many as possible but last time I publicly asked for job applications in 2015 I was swamped and could not, so I can’t promise an answer. If you think I’ve insanely ignored you, persist for a while.
I will use this blog to throw out ideas. It’s important when dealing with large organisations to dart around at different levels, not be stuck with formal hierarchies. It will seem chaotic and ‘not proper No10 process’ to some. But the point of this government is to do things differently and better and this always looks messy. We do not care about trying to ‘control the narrative’ and all that New Labour junk and this government will not be run by ‘comms grid’.
As Paul Graham and Peter Thiel say, most ideas that seem bad are bad but great ideas also seem at first like bad ideas — otherwise someone would have already done them. Incentives and culture push people in normal government systems away from encouraging ‘ideas that seem bad’. Part of the point of a small, odd No10 team is to find and exploit, without worrying about media noise, what Andy Grove called ‘very high leverage ideas’ and these will almost inevitably seem bad to most.
I will post some random things over the next few weeks and see what bounces back — it is all upside, there’s no downside if you don’t mind a bit of noise and it’s a fast cheap way to find good ideas…
2 notes
Text
What is the role of Strategic HRM in large organizations that value diversity?
3.2 Action Required: Read Ch. 2, "Developing Marketing Strategies and a Marketing Plan," from the textbook: Dhruv Grewal and Michael Levy (2020), Marketing (8th Edition), McGraw-Hill Education, Digital Version, ISBN13: 978-1-260-71743-3.
3.3 Test your Knowledge (Question): Discussion Question #1: For any brand of your choice, locate a SWOT analysis for a brand you currently use. For each section, list an additional item that was not part of the initial analysis and discuss why you included this item.
3.2 Action Required: Watch the video and read the article at the following link: https://www.investopedia.com/terms/f/financial-sta…
3.3 Test your Knowledge (Question): Define financial statement analysis (with two examples).
3.2 Action Required: "Sometimes when confronted with a situation in HR, we try to solve it quickly based on experience. Or we immediately address what we think is the causation. When we start midway into the process, we miss the opportunity to ask the right questions. That's what enables us to examine all the options."
3.3 Test your Knowledge (Question): Keeping the above statement in mind, answer the question given below: What is the role of Strategic HRM in large organizations that value diversity?
2.3 Test your Knowledge (Question): Discuss the relationship between globalization and national sovereignty. Maximum word limit: 250 words.
3.3 Test your Knowledge (Question): Compare and contrast political economy and political system. Maximum word limit: 250 words.
Purpose: To assess your ability to apply the terminology of decision making to describe business problems, and to compare and contrast deterministic and probabilistic models.
Action Items: Using the definitions found in Chapter 1 of Quantitative Analysis, the Internet, and your own personal experiences, make notes on and post one example of each of the following to the class Discussion Board topic "Deterministic and Probabilistic Models": a deterministic model; a probabilistic model; and a situation in which you could use post-optimality analysis (also known as sensitivity analysis).
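For the last prompt, a hedged illustration in Python (not part of the assignment text; the break-even scenario and all numbers are invented) of the difference between a deterministic and a probabilistic model:

```python
import random

def deterministic_break_even(fixed_cost, price, unit_cost):
    """Deterministic model: the same inputs always produce exactly the same output."""
    return fixed_cost / (price - unit_cost)

def probabilistic_break_even(fixed_cost, price, unit_cost_mean, unit_cost_sd, trials=10_000):
    """Probabilistic model: unit cost is uncertain, so the output is a distribution,
    summarized here by its median and 95th percentile."""
    results = sorted(
        fixed_cost / (price - random.gauss(unit_cost_mean, unit_cost_sd))
        for _ in range(trials)
    )
    return results[len(results) // 2], results[int(0.95 * len(results))]

print(deterministic_break_even(10_000, 25, 15))      # always 1000.0 units
print(probabilistic_break_even(10_000, 25, 15, 2))   # varies from run to run
```

Re-running either model while nudging one input at a time (say, the price) is the essence of the post-optimality, or sensitivity, analysis the prompt mentions.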
First appeared on Assignments.tips
0 notes
Text
If you did not already know
Mean Field Reinforcement Learning (MFRL)
Existing multi-agent reinforcement learning methods are limited typically to a small number of agents. When the agent number increases largely, the learning becomes intractable due to the curse of the dimensionality and the exponential growth of user interactions. In this paper, we present Mean Field Reinforcement Learning where the interactions within the population of agents are approximated by those between a single agent and the average effect from the overall population or neighboring agents; the interplay between the two entities is mutually reinforced: the learning of the individual agent's optimal policy depends on the dynamics of the population, while the dynamics of the population change according to the collective patterns of the individual policies. We develop practical mean field Q-learning and mean field Actor-Critic algorithms and analyze the convergence of the solution. Experiments on resource allocation, Ising model estimation, and battle game tasks verify the learning effectiveness of our mean field approaches in handling many-agent interactions in population. …
Monica
Can you remember the names of the children of all your friends? Can you remember the wedding anniversary of your brother? Can you tell the last time you called your grandmother and what you talked about? Monica lets you quickly and easily log all that information so you can be a better friend, family member or spouse. …
Probabilistic Causation
Probabilistic causation is a concept in a group of philosophical theories that aim to characterize the relationship between cause and effect using the tools of probability theory. The central idea behind these theories is that causes raise the probabilities of their effects, all else being equal. Interpreting causation as a deterministic relation means that if A causes B, then A must always be followed by B. In this sense, war does not cause deaths, nor does smoking cause cancer. As a result, many turn to a notion of probabilistic causation. Informally, A probabilistically causes B if A's occurrence increases the probability of B. This is sometimes interpreted to reflect imperfect knowledge of a deterministic system but other times interpreted to mean that the causal system under study has an inherently indeterministic nature. (Propensity probability is an analogous idea, according to which probabilities have an objective existence and are not just limitations in a subject's knowledge.) Philosophers such as Hugh Mellor and Patrick Suppes have defined causation in terms of a cause preceding and increasing the probability of the effect. …
Self-Imitation Learning (SIL)
This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent's past good decisions. This algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration. Our empirical results show that SIL significantly improves advantage actor-critic (A2C) on several hard exploration Atari games and is competitive to the state-of-the-art count-based exploration methods. We also show that SIL improves proximal policy optimization (PPO) on MuJoCo tasks. …
https://bit.ly/30eqxjv
0 notes
Text
The Real Importance of Data Preparation
In a world focused on buzzword-driven models and algorithms, you'd be forgiven for forgetting about the unreasonable importance of data preparation and quality: your models are only as good as the data you feed them. This is the garbage in, garbage out principle: flawed data going in leads to flawed results, algorithms, and business decisions. If a self-driving car's decision-making algorithm is trained on data of traffic collected during the day, you wouldn't put it on the roads at night. To take it a step further, if such an algorithm is trained in an environment with cars driven by humans, how can you expect it to perform well on roads with other self-driving cars? Beyond the autonomous driving example described, the "garbage in" side of the equation can take many forms—for example, incorrectly entered data, poorly packaged data, and data collected incorrectly, more of which we'll address below.
When executives ask me how to approach an AI transformation, I show them Monica Rogati's AI Hierarchy of Needs, which has AI at the top, and everything is built upon the foundation of data (Rogati is a data science and AI advisor, former VP of data at Jawbone, and former LinkedIn data scientist).
Why is high-quality and accessible data foundational? If you're basing business decisions on dashboards or the results of online experiments, you need to have the right data. On the machine learning side, we are entering what Andrej Karpathy, director of AI at Tesla, dubs the Software 2.0 era, a new paradigm for software where machine learning and AI require less focus on writing code and more on configuring, selecting inputs, and iterating through data to create higher level models that learn from the data we give them. In this new world, data has become a first-class citizen, where computation becomes increasingly probabilistic and programs no longer do the same thing each time they run. The model and the data specification become more important than the code.
Collecting the right data requires a principled approach that is a function of your business question. Data collected for one purpose can have limited use for other questions. The assumed value of data is a myth leading to inflated valuations of start-ups capturing said data. John Myles White, data scientist and engineering manager at Facebook, wrote: "The biggest risk I see with data science projects is that analyzing data per se is generally a bad thing. Generating data with a pre-specified analysis plan and running that analysis is good. Re-analyzing existing data is often very bad." John is drawing attention to thinking carefully about what you hope to get out of the data, what question you hope to answer, what biases may exist, and what you need to correct before jumping in with an analysis. With the right mindset, you can get a lot out of analyzing existing data—for example, descriptive data is often quite useful for early-stage companies.
Not too long ago, "save everything" was a common maxim in tech; you never knew if you might need the data. However, attempting to repurpose pre-existing data can muddy the water by shifting the semantics from why the data was collected to the question you hope to answer. In particular, determining causation from correlation can be difficult.
For example, a pre-existing correlation pulled from an organization's database should be tested in a new experiment and not assumed to imply causation, instead of this commonly encountered pattern in tech:
A large fraction of users that do X do Z
Z is good
Let's get everybody to do X
Correlation in existing data is evidence for causation that then needs to be verified by collecting more data.
The same challenge plagues scientific research. Take the case of Brian Wansink, former head of the Food and Brand Lab at Cornell University, who stepped down after a Cornell faculty review reported he "committed academic misconduct in his research and scholarship, including misreporting of research data, problematic statistical techniques [and] failure to properly document and preserve research results." One of his more egregious errors was to continually test already collected data for new hypotheses until one stuck, after his initial hypothesis failed. NPR put it well: "the gold standard of scientific studies is to make a single hypothesis, gather data to test it, and analyze the results to see if it holds up. By Wansink's own admission in the blog post, that's not what happened in his lab." He continually tried to fit new hypotheses unrelated to why he collected the data until he got a null hypothesis with an acceptable p-value—a perversion of the scientific method.
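To make concrete why testing already collected data against ever-new hypotheses is a perversion of the method, here is a small, hedged simulation (all numbers are arbitrary, and the "plate size" framing is only a nod to the example above): on pure noise, scanning enough candidate hypotheses will usually surface a nominally "significant" p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# One dataset of pure noise: 200 'diners', 50 unrelated candidate outcomes.
n_diners, n_outcomes = 200, 50
group = rng.integers(0, 2, n_diners)                 # e.g., large plate vs. small plate
outcomes = rng.normal(size=(n_diners, n_outcomes))   # none of these depend on group

# Test every outcome against the grouping, as if each were a fresh hypothesis.
p_values = [
    stats.ttest_ind(outcomes[group == 0, j], outcomes[group == 1, j]).pvalue
    for j in range(n_outcomes)
]

print(f"smallest p-value across {n_outcomes} hypotheses: {min(p_values):.3f}")
print(f"nominally significant at 0.05: {sum(p < 0.05 for p in p_values)} of {n_outcomes}")
```

Roughly one in twenty of these null comparisons will clear the 0.05 bar by chance alone, which is exactly why a hypothesis should be fixed before the data are collected, or at least corrected for the number of looks.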
Data professionals spend an inordinate amount of time cleaning, repairing, and preparing data
Before you even think about sophisticated modeling, state-of-the-art machine learning, and AI, you need to make sure your data is ready for analysis—this is the realm of data preparation. You may picture data scientists building machine learning models all day, but the common trope that they spend 80% of their time on data preparation is closer to the truth.
This is old news in many ways, but it's old news that still plagues us: a recent O'Reilly survey found that lack of data or data quality issues was one of the main bottlenecks for further AI adoption for companies at the AI evaluation stage and was the main bottleneck for companies with mature AI practices.
Good quality datasets are all alike, but every low-quality dataset is low-quality in its own way. Data can be low-quality if:
It doesn't fit your question or its collection wasn't carefully considered;
It's erroneous (it may say "cicago" for a location), inconsistent (it may say "cicago" in one place and "Chicago" in another), or missing;
It's good data but packaged in an atrocious way—e.g., it's stored across a range of siloed databases in an organization;
It requires human labeling to be useful (such as manually labeling emails as "spam" or "not" for a spam detection algorithm).
This definition of low-quality data defines quality as a function of how much work is required to get the data into an analysis-ready form. Look at the responses to my tweet for data quality nightmares that modern data professionals grapple with.
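As a tiny, hedged sketch of what getting data into an analysis-ready form can mean for the problems listed above (the column names and values are invented; only the "cicago" typo comes from the text):

```python
import pandas as pd

# Invented toy data illustrating inconsistent spellings and missing values.
raw = pd.DataFrame({
    "city": ["Chicago", "cicago", "CHICAGO", None, "New York"],
    "revenue": [120.0, 95.5, None, 40.0, 310.0],
})

# Normalize case/whitespace, then map known misspellings to a canonical form.
clean = raw.assign(city=raw["city"].str.strip().str.title().replace({"Cicago": "Chicago"}))

# Make the treatment of missing values an explicit, auditable decision.
clean["revenue"] = clean["revenue"].fillna(clean["revenue"].median())
clean = clean.dropna(subset=["city"])

print(clean)
```

Each of these small choices (title-casing, median imputation, dropping rows) is exactly the kind of decision that should be explicit and reviewable rather than made silently inside a spreadsheet.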
The importance of automating data preparation
Most of the conversation around AI automation involves automating machine learning models, a field known as AutoML. This is important: consider how many modern models need to operate at scale and in real time (such as Google's search engine and the relevant tweets that Twitter surfaces in your feed). We also need to be talking about automation of all steps in the data science workflow/pipeline, including those at the start. Why is it important to automate data preparation?
1. It occupies an inordinate amount of time for data professionals. Data drudgery automation in the era of data smog will free data scientists up for doing more interesting, creative work (such as modeling or interfacing with business questions and insights). "76% of data scientists view data preparation as the least enjoyable part of their work," according to a CrowdFlower survey.
2. A series of subjective data preparation micro-decisions can bias your analysis. For example, one analyst may throw out data with missing values, another may infer the missing values. For more on how micro-decisions in analysis can impact results, I recommend Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results (note that the analytical micro-decisions in this study are not only data preparation decisions). Automating data preparation won't necessarily remove such bias, but it will make it systematic, discoverable, auditable, unit-testable, and correctable. Model results will then be less reliant on individuals making hundreds of micro-decisions. An added benefit is that the work will be reproducible and robust, in the sense that somebody else (say, in another department) can reproduce the analysis and get the same results.
3. For the increasing number of real-time algorithms in production, humans need to be taken out of the loop at runtime as much as possible (and perhaps be kept in the loop more as algorithmic managers): when you use Siri to make a reservation on OpenTable by asking for a table for four at a nearby Italian restaurant tonight, there's a speech-to-text model, a geographic search model, and a restaurant-matching model, all working together in real time. No data analysts/scientists work on this data pipeline as everything must happen in real time, requiring an automated data preparation and data quality workflow (e.g., to resolve if I say "eye-talian" instead of "it-alian").
The third point above speaks more generally to the need for automation around all parts of the data science workflow. This need will grow as smart devices, IoT, voice assistants, drones, and augmented and virtual reality become more prevalent. Automation represents a specific case of democratization, making data skills easily accessible for the broader population. Democratization involves both education (which I focus on in my work at DataCamp) and developing tools that many people can use.
Understanding the importance of general automation and democratization of all parts of the DS/ML/AI workflow, it's important to recognize that we've done pretty well at democratizing data collection and gathering, modeling, and data reporting, but what remains stubbornly difficult is the whole process of preparing the data.
Modern tools for automating data cleaning and data preparation
We're seeing the emergence of modern tools for automated data cleaning and preparation, such as HoloClean and Snorkel coming from Christopher Ré's group at Stanford. HoloClean decouples the task of data cleaning into error detection (such as recognizing that the location "cicago" is erroneous) and repairing erroneous data (such as changing "cicago" to "Chicago"), and formalizes the fact that "data cleaning is a statistical learning and inference problem." All data analysis and data science work is a combination of data, assumptions, and prior knowledge. So when you're missing data or have "low-quality data," you use assumptions, statistics, and inference to repair your data. HoloClean performs this automatically in a principled, statistical manner. All the user needs to do is "to specify high-level assertions that capture their domain expertise with respect to invariants that the input data needs to satisfy. No other supervision is required!"
The HoloClean team also has a system for automating the "building and managing [of] training datasets without manual labeling" called Snorkel. Having correctly labeled data is a key part of preparing data to build machine learning models. As more and more data is generated, manually labeling it is unfeasible. Snorkel provides a way to automate labeling, using a modern paradigm called data programming, in which users are able to "inject domain information [or heuristics] into machine learning models in higher level, higher bandwidth ways than manually labeling thousands or millions of individual data points." Researchers at Google AI have adapted Snorkel to label data at industrial/web scale and demonstrated its utility in three scenarios: topic classification, product classification, and real-time event classification.
Snorkel doesn't stop at data labeling. It also allows you to automate two other key aspects of data preparation:
Data augmentation—that is, creating more labeled data. Consider an image recognition problem in which you are trying to detect cars in photos for your self-driving car algorithm. Classically, you'll need at least several thousand labeled photos for your training dataset. If you don't have enough training data and it's too expensive to manually collect and label more data, you can create more by rotating and reflecting your images.
Discovery of critical data subsets—for example, figuring out which subsets of your data really help to distinguish spam from non-spam.
These are two of many current examples of the augmented data preparation revolution, which includes products from IBM and DataRobot.
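The data-programming idea can be sketched in a few lines of plain Python. This is a hand-rolled toy, not Snorkel's actual API, and the spam heuristics are invented: each labeling function votes or abstains, and the votes are combined, here by simple majority, where Snorkel instead fits a generative model over the labeling functions.

```python
SPAM, NOT_SPAM, ABSTAIN = 1, 0, -1

def lf_contains_prize(email: str) -> int:
    return SPAM if "prize" in email.lower() else ABSTAIN

def lf_many_exclamations(email: str) -> int:
    return SPAM if email.count("!") >= 3 else ABSTAIN

def lf_known_sender(email: str) -> int:
    return NOT_SPAM if email.startswith("From: alice@") else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_prize, lf_many_exclamations, lf_known_sender]

def weak_label(email: str) -> int:
    """Combine noisy heuristic votes into a single (weak) training label."""
    votes = [v for v in (lf(email) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN                      # no heuristic fired; leave unlabeled
    return SPAM if votes.count(SPAM) >= votes.count(NOT_SPAM) else NOT_SPAM

emails = [
    "From: alice@example.com Lunch tomorrow?",
    "You have won a PRIZE!!! Claim it now!!!",
]
print([weak_label(e) for e in emails])      # [0, 1] -> not spam, spam
```

The appeal is that a domain expert can write heuristics like these far faster than they can hand-label thousands of individual examples.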
The future of data tooling and data preparation as a cultural challenge
So what does the future hold? In a world with an increasing number of models and algorithms in production, learning from large amounts of real-time streaming data, we need both education and tooling/products for domain experts to build, interact with, and audit the relevant data pipelines. We've seen a lot of headway made in democratizing and automating data collection and building models. Just look at the emergence of drag-and-drop tools for machine learning workflows coming out of Google and Microsoft.
As we saw from the recent O'Reilly survey, data preparation and cleaning still take up a lot of time that data professionals don't enjoy. For this reason, it's exciting that we're now starting to see headway in automated tooling for data cleaning and preparation. It will be interesting to see how this space grows and how the tools are adopted.
A bright future would see data preparation and data quality as first-class citizens in the data workflow, alongside machine learning, deep learning, and AI. Dealing with incorrect or missing data is unglamorous but necessary work. It's easy to justify working with data that's obviously wrong; the only real surprise is the amount of time it takes. Understanding how to manage more subtle problems with data, such as data that reflects and perpetuates historical biases (for example, real estate redlining), is a more difficult organizational challenge. This will require honest, open conversations in any organization around what data workflows actually look like.
The fact that business leaders are focused on predictive models and deep learning while data workers spend most of their time on data preparation is a cultural challenge, not a technical one. If this part of the data flow pipeline is going to be solved in the future, everybody needs to acknowledge and understand the challenge.
#datapreparation#AIHierarchy#datascience#Software2.0#Dataprofessionals#AutoML#AIautomation#datasmog#algorithmicmanagers#Moderntools#dataprogramming#GoogleAI#DataRobot#Google#Microsoft#news#blockgeni
This article has been published from a wire agency feed without modifications to the text. Only the headline has been changed.
0 notes
Text
What is Post-Theism?
A few of you might be confused by the post-theist label. No, this does not mean I’m a theist unaffiliated with organized religion. This doesn’t mean I believe in a deity. Post-theism describes an attitude that we are beyond the god question. The atheist label no longer makes sense because the question of god is a settled fact; a god doesn’t exist and never did, so I don’t lack belief, but rather proceed with the knowledge that there’s no god and conduct my life as such.
I no longer dwell on the question or consider the question. Yes, this is compatible with gnostic atheism because it requires knowledge rather than mere non-belief sans knowledge, i.e., agnostic atheism. However, the question of whether a god exists no longer interests me; it no longer occupies my time in that it’s something I give no thought to. Religion and belief in god is a relic of human history. So I am as post-atheistic as I am post-theistic.
Post-(a)theism is a stronger position in that it isn’t a proclamation of non-belief or even knowledge of there being no god. It’s a stronger claim: religion was borne out of human ignorance; our lack of scientific knowledge, historical knowledge, philosophical understanding and reasoning, and technological progress resulted in a belief stemming from agency over-detection, among other fallacious conclusions. Religion was the result of primitive thinking, underdeveloped reasoning, and a severe misapprehension of the world we live in.
In many ways we are all post-theistic in that we don’t attribute lightning, tidal waves, strong winds, volcanic eruptions, and earthquakes to the wrath of a god. We moved past polytheistic explanations of natural phenomena and remain only with the palpably silly idea that a god created the universe and world. I am at a point where those notions are as ridiculous as the idea that Zeus launches every lightning bolt everywhere -- including on planets like Jupiter. What I’ve learned about causation, the dispositions of material objects, and the universe doesn’t allow for such an explanation; never mind that god is a human projection, a way of seeing our own image behind phenomena we can’t even begin to control.
God is the name of an idealized human, infinite in every domain we are finite in: infinitely knowledgeable, powerful, moral, and good; every one of us will die and yet god is considered eternal. God is the name of human naiveté and arrogance, the notion that the creator of the universe must be a perfect version of ourselves. God is the name of the lack of imagination of our ancestors. If anything, imagination hasn’t discovered a super-human controlling and governing the universe; imagination has discovered natural forces that move celestial bodies and oversee their formation; imagination has scaled down the universe to previously incomprehensible small scales; imagination has proven once and for all that the universe is probabilistic, that chance rather than agency is more prevalent in the universe. Imagination has shown that the idea of god was borne from a lack of creativity rather than masterful ingenuity. Whether you like it or not, we are beyond the need for god as ultimate explanation or temporary placeholder; we are beyond the question of whether one exists. This is the age of post-theism.
217 notes
·
View notes
Text
Inequality and Rebellion
Does inequality cause conflict? @deusvulture tells us what common sense would tell us: Of course it does.
income inequality qua income inequality isn’t really an economic issue; it’s a quality-of-life issue and a security issue. For some reason, people understand this when it comes to global inequality but not when it comes to intranational inequality.
It causes interstate conflict. It causes intrastate conflict. Inequality’s effect on conflict is so serious that it isn’t an economic issue. It’s a security issue. That’s the common sense answer. It’s the intuitive answer. Amartya Sen, in the introduction to On Economic Inequality (1973), wrote that “the relation between inequality and rebellion is indeed a close one,” and who could disagree with that?
It would be the easiest thing to test: We know how to measure inequality. We know how to measure conflict. We have strong theories of causation. If the relationship was that strong, the findings should have been clear, consistent, strong, and positive.
They weren’t.
I’ll let Mark Irving Lichbach tell you the details. From “An Evaluation of ‘Does Economic Inequality Breed Political Conflict?’ Studies” World Politics 41:4 (1989):
In sum, two decades of empirical research in conflict studies have challenged the conventionally accepted view that a strong positive relationship exists between economic inequality and political conflict. [Economic inequality–political conflict (EI-PC)] studies have produced an equivocal answer about the EI-PC nexus. While numerous analyses purport to show that economic inequality has a positive impact on political dissent, others purport to show negative and negligible relationships. Midlarsky has stated that “rarely is there a robust relationship discovered between the two variables. Equally rarely does the relationship plunge into the depths of the black hole of nonsignificance.”
This diverse and contradictory array of findings has baffled and intrigued investigators. Hence, Midlarsky suggests that we locate many theories that are "context-specific," Mitchell that we have "two contrary theories of rebellion, each with some basis in fact," and Zimmerman offers so many qualifications to the EI-PC nexus that he implies that we might have no theories at all here!
Why have EI-PC studies produced contradictory results? As Dina Zinnes wrote in a review of quantitative studies of external war, "I find myself perplexed: why do tests of the same hypothesis using different data, research designs, and methodologies appear to produce such dramatically different conclusions ... ?”
Why? Because when the intuition ran ahead of the facts, the facts were massaged to fit, as researchers added “a little spice”:
Second, a little spice is involved: the initial speculation, a strong and positive relationship between economic inequality and political dissent, sometimes, but not always, conflicts with the data. Anomalous, inconsistent, and inconclusive findings provide grist for theoretical and empirical reformulations of the basic EI-PC idea.
Researcher freedom might have driven the field -- through “alternate definitions of economic inequality and of political conflict, and from the different cases explored, the various time frames in which the effects on conflict are examined, and the different ceteris paribus understandings about the context” -- but those anomalous, inconsistent, and inconclusive findings are hard to ignore.
The statistical modelers have revealed that no clear answer about the EI-PC nexus exists, and none is likely to emerge. The evidence thus supports the view that, in general, economic inequality is neither necessary, sufficient, nor clearly probabilistically related to dissent.
The decades have not changed that conclusion. Jeffrey Dixon’s survey of 46 quantitative studies in “What causes civil wars? Integrating quantitative research findings” International Studies Review 11 (2009) identified five studies that used income inequality as an independent variable explaining civil war. None found a statistically significant relationship between them.
Every paper in the field proposing a new relationship began with statements like this one, from Christopher Cramer’s “Does inequality cause conflict?” Journal of International Development 15 (2003):
The role of economic inequality in economic growth and in the political economy of violent conflict has remained elusive. This paper discusses why. One problem is how weak the empirical foundations remain for any argument or finding based, especially, on inter-country comparisons.
Or this one, from Marie Besançon’s “Relative Resources: Inequality in Ethnic Wars, Revolutions, and Genocides” Journal of Peace Research 42:4 (2005):
Political scientists, for decades, have argued that there is a nexus between economic inequality and political violence, yet these decades of studies have empirically challenged this view. Rarely have statistical studies resulted in a robust relationship between the two variables, and the results have often been contradictory and inconclusive.
The authors have their excuses. Cramer tells us to ignore the “superficial outward signs of inequality, for example the Gini coefficient” for the “historically conditioned social relations that, given their infinitely open set of specificities, nonetheless sometimes produce similar outward signs.” Cramer just wants us to add a little spice. Well, maybe not a little.
We do have strong and consistent predictors of social conflict. They just don’t include inequality. Paul Collier and Anke Hoeffler, in “Greed and Grievance in Civil War” Oxford Economic Papers 56:4 (2004), found that many variables simply do not matter. Not inequality. Not political rights. Not ethnic polarization. Not religious fractionalization.
So what did matter? Chris Blattman and Edward Miguel’s review paper, “Civil War,” in the Journal of Economic Literature 48:1 (2010) tells us that two factors are robustly linked to civil war: “low per capita incomes and slow economic growth.” Håvard Hegre and Nicholas Sambanis, in “Sensitivity Analysis of Empirical Results on Civil War Onset” Journal of Conflict Resolution 50 (2006), which tested standard predictors for robustness, identified a few more:
... large population and low income levels, low rates of economic growth, recent political instability and inconsistent democratic institutions, small military establishments and rough terrain, and war-prone and undemocratic neighbors.
If you want to prevent civil war, you need to increase incomes and growth rates, limit political instability, expand your armed forces, and ensure your neighbors are peaceable democracies. Reducing inequality would not do much good.
Inequality doesn’t seem to matter.
50 notes
·
View notes
Text
If you did not already know
Mean Field Reinforcement Learning (MFRL)
Existing multi-agent reinforcement learning methods are typically limited to a small number of agents. When the number of agents increases greatly, learning becomes intractable due to the curse of dimensionality and the exponential growth of agent interactions. In this paper, we present Mean Field Reinforcement Learning, where the interactions within the population of agents are approximated by those between a single agent and the average effect from the overall population or neighboring agents; the interplay between the two entities is mutually reinforced: the learning of the individual agent’s optimal policy depends on the dynamics of the population, while the dynamics of the population change according to the collective patterns of the individual policies. We develop practical mean field Q-learning and mean field Actor-Critic algorithms and analyze the convergence of the solution. Experiments on resource allocation, Ising model estimation, and battle game tasks verify the learning effectiveness of our mean field approaches in handling many-agent interactions in a population. …
Monica
Can you remember the names of the children of all your friends? Can you remember the wedding anniversary of your brother? Can you tell the last time you called your grandmother and what you talked about? Monica lets you quickly and easily log all of that information so you can be a better friend, family member, or spouse. …
Probabilistic Causation
Probabilistic causation is a concept in a group of philosophical theories that aim to characterize the relationship between cause and effect using the tools of probability theory. The central idea behind these theories is that causes raise the probabilities of their effects, all else being equal. Interpreting causation as a deterministic relation means that if A causes B, then A must always be followed by B. In this sense, war does not cause deaths, nor does smoking cause cancer. As a result, many turn to a notion of probabilistic causation. Informally, A probabilistically causes B if A’s occurrence increases the probability of B. This is sometimes interpreted to reflect imperfect knowledge of a deterministic system, but other times interpreted to mean that the causal system under study has an inherently indeterministic nature. (Propensity probability is an analogous idea, according to which probabilities have an objective existence and are not just limitations in a subject’s knowledge.) Philosophers such as Hugh Mellor and Patrick Suppes have defined causation in terms of a cause preceding and increasing the probability of the effect. …
Self-Imitation Learning (SIL)
This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent’s past good decisions. This algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration. Our empirical results show that SIL significantly improves advantage actor-critic (A2C) on several hard exploration Atari games and is competitive with state-of-the-art count-based exploration methods. We also show that SIL improves proximal policy optimization (PPO) on MuJoCo tasks. …
https://bit.ly/2Aykdtc
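The probability-raising idea in the Probabilistic Causation entry can be made concrete with a tiny simulation. The sketch below is a minimal illustration under assumed, made-up rates (it is not drawn from any of the works cited above): it generates a binary cause and effect and checks whether P(effect | cause) exceeds P(effect | no cause).

```python
import random

# Minimal sketch: probability-raising checked on simulated binary data.
# The 0.3 / 0.05 rates below are invented for illustration, not real statistics.
random.seed(0)

def simulate(n=100_000):
    rows = []
    for _ in range(n):
        cause = random.random() < 0.5            # does the putative cause occur?
        p_effect = 0.3 if cause else 0.05        # the cause raises the chance of the effect
        effect = random.random() < p_effect
        rows.append((cause, effect))
    return rows

def conditional_prob(rows, given):
    subset = [effect for cause, effect in rows if cause == given]
    return sum(subset) / len(subset)

data = simulate()
p_given_cause = conditional_prob(data, True)
p_given_no_cause = conditional_prob(data, False)

# "A probabilistically causes B" (informally) when this difference is positive,
# all else being equal -- here there are no confounders by construction.
print(p_given_cause, p_given_no_cause)
print("probability-raising:", p_given_cause > p_given_no_cause)
```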
0 notes
Text
The Unreasonable Importance of Data Preparation in 2020
[Embedded YouTube video: The Unreasonable Importance of Data Preparation in 2020]
In a world focused on buzzword-driven models and algorithms, you’d be forgiven for forgetting about the unreasonable importance of data preparation and quality: your models are only as good as the data you feed them.
This is the garbage in, garbage out principle: flawed data going in leads to flawed results, algorithms, and business decisions. If a self-driving car’s decision-making algorithm is trained on traffic data collected during the day, you wouldn’t put it on the roads at night.
To take it a step further, if such an algorithm is trained in an environment with cars driven by humans, how can you expect it to perform well on roads with other self-driving cars?
Beyond the autonomous driving example described, the “garbage in” side of the equation can take many forms—for example, incorrectly entered data, poorly packaged data, and data collected incorrectly, all of which we’ll address below.
When executives ask me how to approach an AI transformation, I show them Monica Rogati’s AI Hierarchy of Needs, which has AI at the top, and everything is built upon the foundation of data (Rogati is a data science and AI advisor, former VP of data at Jawbone, and former LinkedIn data scientist):
AI Hierarchy of Needs 2020
Image courtesy of Monica Rogati, used with permission.
Why is high-quality and accessible data foundational?
If you’re basing business decisions on dashboards or the results of online experiments, you need to have the right data.
On the machine learning side, we are entering what Andrei Karpathy, director of AI at Tesla, dubs the Software 2.0 era, a new paradigm for software where machine learning and AI require less focus on writing code and more on configuring, selecting inputs, and iterating through data to create higher level models that learn from the data we give them.
In this new world, data has become a first-class citizen, where computation becomes increasingly probabilistic and programs no longer do the same thing each time they run.
The model and the data specification become more important than the code.
Collecting the right data requires a principled approach that is a function of your business question.
Data collected for one purpose can have limited use for other questions.
The assumed value of data is a myth leading to inflated valuations of start-ups capturing said data. John Myles White, data scientist and engineering manager at Facebook, wrote:
“The biggest risk I see with data science projects is that analyzing data per se is generally a bad thing.
Generating data with a pre-specified analysis plan and running that analysis is good. Re-analyzing existing data is often very bad.”
John is drawing attention to thinking carefully about what you hope to get out of the data, what question you hope to answer, what biases may exist, and what you need to correct before jumping in with an analysis[1].
With the right mindset, you can get a lot out of analyzing existing data—for example, descriptive data is often quite useful for early-stage companies[2].
Not too long ago, “save everything” was a common maxim in tech; you never knew if you might need the data. However, attempting to repurpose pre-existing data can muddy the water by shifting the semantics from why the data was collected to the question you hope to answer. In particular, determining causation from correlation can be difficult.
For example, a pre-existing correlation pulled from an organization’s database should be tested in a new experiment and not assumed to imply causation[3], instead of following this commonly encountered pattern in tech:
A large fraction of users that do X do Z.
Z is good.
Let’s get everybody to do X.
Correlation in existing data is evidence for causation that then needs to be verified by collecting more data.
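As a hedged illustration of why that pattern misleads, the sketch below simulates a hidden trait that drives both X and Z. X and Z end up strongly associated, yet forcing everyone to do X would not move Z at all; only a new experiment (an intervention) would reveal that. All names and rates are invented for the example.

```python
import random

random.seed(1)
n = 50_000

x, z = [], []
for _ in range(n):
    engaged = random.random() < 0.5          # hidden confounder: an "engaged user" trait
    # Engaged users are more likely to do X and, independently, more likely to do Z.
    x.append(random.random() < (0.8 if engaged else 0.1))
    z.append(random.random() < (0.7 if engaged else 0.2))

def p_z_given(x_value=None):
    pairs = list(zip(x, z)) if x_value is None else [(a, b) for a, b in zip(x, z) if a == x_value]
    return sum(b for _, b in pairs) / len(pairs)

# Z looks much more likely among users who do X...
print("P(Z | X):    ", round(p_z_given(True), 3))
print("P(Z | not X):", round(p_z_given(False), 3))
# ...but X has no causal effect on Z here: intervening on X would leave Z at the
# rate set by the hidden trait, so "get everybody to do X" would disappoint.
```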
The same challenge plagues scientific research. Take the case of Brian Wansink, former head of the Food and Brand Lab at Cornell University, who stepped down after a Cornell faculty review reported he “committed academic misconduct in his research and scholarship, including misreporting of research data, problematic statistical techniques [and] failure to properly document and preserve research results.” One of his more egregious errors was to continually test already collected data for new hypotheses until one stuck, after his initial hypothesis failed[4]. NPR put it well: “the gold standard of scientific studies is to make a single hypothesis, gather data to test it, and analyze the results to see if it holds up. By Wansink’s own admission in the blog post, that’s not what happened in his lab.” He continually tried to fit new hypotheses unrelated to why he collected the data until he got a hypothesis with an acceptable p-value—a perversion of the scientific method.
Data professionals spend an inordinate amount of time cleaning, repairing, and preparing data
Before you even think about sophisticated modeling, state-of-the-art machine learning, and AI, you need to make sure your data is ready for analysis—this is the realm of data preparation. You may picture data scientists building machine learning models all day, but the common trope that they spend 80% of their time on data preparation is closer to the truth.
This is old news in many ways, but it’s old news that still plagues us: a recent O’Reilly survey found that lack of data or data quality issues was one of the main bottlenecks for further AI adoption for companies at the AI evaluation stage and was the main bottleneck for companies with mature AI practices.
Good quality datasets are all alike, but every low-quality dataset is low-quality in its own way[5]. Data can be low-quality if:
It doesn’t fit your question or its collection wasn’t carefully considered;
It’s erroneous (it may say “cicago” for a location), inconsistent (it may say “cicago” in one place and “Chicago” in another), or missing;
It’s good data but packaged in an atrocious way—e.g., it’s stored across a range of siloed databases in an organization;
It requires human labeling to be useful (such as manually labeling emails as “spam” or “not” for a spam detection algorithm).
This definition of low-quality data defines quality as a function of how much work is required to get the data into an analysis-ready form. Look at the responses to my tweet for data quality nightmares that modern data professionals grapple with.
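As a concrete, deliberately tiny illustration of the first three failure modes above, here is a minimal cleaning sketch; the column names, values, and hand-made corrections table are assumptions for the example, not part of any particular product or pipeline.

```python
import pandas as pd
import numpy as np

# Toy records with the kinds of problems described above: misspellings,
# inconsistencies, and missing values. All values are made up.
df = pd.DataFrame({
    "city": ["cicago", "Chicago", "new york", None, "Chicago "],
    "revenue": [120.0, np.nan, 95.5, 40.0, 88.0],
})

# 1. Normalize formatting inconsistencies (case, stray whitespace).
df["city"] = df["city"].str.strip().str.title()

# 2. Repair known misspellings with a small, hand-maintained lookup table.
corrections = {"Cicago": "Chicago"}
df["city"] = df["city"].replace(corrections)

# 3. Make missing data explicit and decide (and document!) a policy for it.
df["revenue_missing"] = df["revenue"].isna()
df["revenue"] = df["revenue"].fillna(df["revenue"].median())

print(df)
```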
The importance of automating data preparation
Most of the conversation around AI automation involves automating machine learning models, a field known as AutoML.
This is important: consider how many modern models need to operate at scale and in real time (such as Google’s search engine and the relevant tweets that Twitter surfaces in your feed). We also need to be talking about automation of all steps in the data science workflow/pipeline, including those at the start. Why is it important to automate data preparation?
It occupies an inordinate amount of time for data professionals. Automating data drudgery in the era of data smog will free data scientists up for doing more interesting, creative work (such as modeling or interfacing with business questions and insights). “76% of data scientists view data preparation as the least enjoyable part of their work,” according to a CrowdFlower survey.
A series of subjective data preparation micro-decisions can bias your analysis. For example, one analyst may throw out data with missing values, another may infer the missing values (the sketch after the third point below makes this concrete). For more on how micro-decisions in analysis can impact results, I recommend Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results[6] (note that the analytical micro-decisions in this study are not only data preparation decisions).
Automating data preparation won’t necessarily remove such bias, but it will make it systematic, discoverable, auditable, unit-testable, and correctable. Model results will then be less reliant on individuals making hundreds of micro-decisions.
An added benefit is that the work will be reproducible and robust, in the sense that somebody else (say, in another department) can reproduce the analysis and get the same results[7];
For the increasing number of real-time algorithms in production, humans need to be taken out of the loop at runtime as much as possible (and perhaps be kept in the loop more as algorithmic managers): when you use Siri to make a reservation on OpenTable by asking for a table for four at a nearby Italian restaurant tonight, there’s a speech-to-text model, a geographic search model, and a restaurant-matching model, all working together in real time.
No data analysts/scientists work on this data pipeline as everything must happen in real time, requiring an automated data preparation and data quality workflow (e.g., to resolve if I say “eye-talian” instead of “it-atian”).
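Returning to the micro-decision example from the second point: the minimal sketch below (with invented numbers) shows how “drop the missing values,” “treat missing as zero,” and “impute the median” give three different answers to the same question. None is automatically wrong, but an automated, documented policy makes the choice visible, auditable, and correctable.

```python
import pandas as pd
import numpy as np

# Toy sales column with missing entries (values are invented for illustration).
sales = pd.Series([10.0, 12.0, np.nan, 50.0, np.nan, 11.0])

# Analyst A: silently drop the missing values.
mean_dropped = sales.dropna().mean()

# Analyst B: treat a missing entry as "no sale" and fill with zero.
mean_zero_filled = sales.fillna(0).mean()

# Analyst C: impute with the median of the observed values.
mean_median_imputed = sales.fillna(sales.median()).mean()

# Three defensible micro-decisions, three different answers to "what are average sales?"
print(mean_dropped, mean_zero_filled, mean_median_imputed)   # 20.75, 13.83..., 17.66...
```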
The third point above speaks more generally to the need for automation around all parts of the data science workflow. This need will grow as smart devices, IoT, voice assistants, drones, and augmented and virtual reality become more prevalent.
Automation represents a specific case of democratization, making data skills easily accessible for the broader population. Democratization involves both education (which I focus on in my work at DataCamp) and developing tools that many people can use.
Given the importance of general automation and democratization of all parts of the DS/ML/AI workflow, it’s important to recognize that we’ve done pretty well at democratizing data collection and gathering, modeling[8], and data reporting[9], but what remains stubbornly difficult is the whole process of preparing the data.
Modern tools for automating data cleaning and data preparation
We’re seeing the emergence of modern tools for automated data cleaning and preparation, such as HoloClean and Snorkel coming from Christopher Ré’s group at Stanford.
HoloClean decouples the task of data cleaning into error detection (such as recognizing that the location “cicago” is erroneous) and repairing erroneous data (such as changing “cicago” to “Chicago”), and formalizes the fact that “data cleaning is a statistical learning and inference problem.”
All data analysis and data science work is a combination of data, assumptions, and prior knowledge. So when you’re missing data or have “low-quality data,” you use assumptions, statistics, and inference to repair your data.
HoloClean performs this automatically in a principled, statistical manner. All the user needs to do is “to specify high-level assertions that capture their domain expertise with respect to invariants that the input data needs to satisfy. No other supervision is required!”
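The sketch below is not HoloClean’s API; it is a much simpler stand-in that mimics the two-step shape described above: detect values that violate a user-supplied invariant, then repair them with the closest valid value. The city list and the string-similarity heuristic are assumptions for illustration, crude stand-ins for the declarative constraints and probabilistic inference a system like HoloClean actually uses.

```python
import difflib

# A domain invariant supplied by the user: locations must come from this set.
# (Illustrative only -- a real system would take declarative constraints.)
VALID_CITIES = {"Chicago", "New York", "Boston", "Seattle"}

records = [{"id": 1, "city": "Chicago"},
           {"id": 2, "city": "cicago"},      # erroneous
           {"id": 3, "city": "Bostn"}]       # erroneous

def detect_errors(rows):
    """Step 1: error detection -- flag values that violate the invariant."""
    return [row for row in rows if row["city"] not in VALID_CITIES]

def repair(row):
    """Step 2: repair -- pick the most similar valid value (a crude stand-in
    for statistical inference over candidate repairs)."""
    match = difflib.get_close_matches(row["city"], sorted(VALID_CITIES), n=1, cutoff=0.6)
    if match:
        row["city"] = match[0]
    return row

for bad_row in detect_errors(records):
    repair(bad_row)

print(records)   # "cicago" -> "Chicago", "Bostn" -> "Boston"
```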
The HoloClean team also has a system for automating the “building and managing [of] training datasets without manual labeling” called Snorkel. Having correctly labeled data is a key part of preparing data to build machine learning models[10].
As more and more data is generated, manually labeling it is unfeasible.
Snorkel provides a way to automate labeling, using a modern paradigm called data programming, in which users are able to “inject domain information [or heuristics] into machine learning models in higher level, higher bandwidth ways than manually labeling thousands or millions of individual data points.”
Researchers at Google AI have adapted Snorkel to label data at industrial/web scale and demonstrated its utility in three scenarios: topic classification, product classification, and real-time event classification.
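To show the spirit of data programming (this is not Snorkel’s actual API, just a hedged sketch): users write small heuristic labeling functions that vote SPAM, NOT_SPAM, or abstain, and their noisy votes are combined into training labels. A simple majority vote stands in here for the weighted label model a system like Snorkel learns.

```python
from collections import Counter

SPAM, NOT_SPAM, ABSTAIN = 1, 0, -1

# Heuristic labeling functions: each encodes a bit of domain knowledge
# and may abstain when it has nothing to say. (Keywords are illustrative.)
def lf_contains_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN

def lf_contains_unsubscribe(text):
    return SPAM if "unsubscribe" in text.lower() else ABSTAIN

def lf_short_personal(text):
    return NOT_SPAM if len(text.split()) < 6 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_prize, lf_contains_unsubscribe, lf_short_personal]

def weak_label(text):
    """Combine noisy votes; a learned label model weights functions by estimated
    accuracy and correlation, but majority vote shows the shape of the idea."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]

emails = ["You have won a PRIZE, click to unsubscribe",
          "Lunch at noon?",
          "Quarterly report attached, please review before the meeting"]

print([weak_label(e) for e in emails])   # [1, 0, -1]
```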
Snorkel doesn’t stop at data labeling. It also allows you to automate two other key aspects of data preparation:
Data augmentation—that is, creating more labeled data. Consider an image recognition problem in which you are trying to detect cars in photos for your self-driving car algorithm.
Classically, you’ll need at least several thousand labeled photos for your training dataset. If you don’t have enough training data and it’s too expensive to manually collect and label more data, you can create more by rotating and reflecting your images (see the sketch below).
Discovery of critical data subsets—for example, figuring out which subsets of your data really help to distinguish spam from non-spam.
These are two of many current examples of the augmented data preparation revolution, which includes products from IBM and DataRobot.
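A hedged sketch of the augmentation idea from the first item above, using NumPy on a stand-in “image” array; a real pipeline would operate on actual photos and typically add further transforms such as crops and brightness shifts.

```python
import numpy as np

def augment(image):
    """Return simple geometric variants of one labeled image.
    Each variant keeps the original label (a car is still a car when mirrored)."""
    return [
        image,
        np.fliplr(image),        # horizontal mirror
        np.flipud(image),        # vertical flip (use with care for road scenes)
        np.rot90(image, k=1),    # 90-degree rotation
        np.rot90(image, k=2),    # 180-degree rotation
    ]

# Stand-in for a labeled training photo: a random 64x64 RGB array.
image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = augment(image)
print(len(augmented), [a.shape for a in augmented])
```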
The future of data tooling and data preparation as a cultural challenge
So what does the future hold? In a world with an increasing number of models and algorithms in production, learning from large amounts of real-time streaming data, we need both education and tooling/products for domain experts to build, interact with, and audit the relevant data pipelines.
We’ve seen a lot of headway made in democratizing and automating data collection and building models. Just look at the emergence of drag-and-drop tools for machine learning workflows coming out of Google and Microsoft.
As we saw from the recent O’Reilly survey, data preparation and cleaning still take up a lot of time that data professionals don’t enjoy. For this reason, it’s exciting that we’re now starting to see headway in automated tooling for data cleaning and preparation. It will be interesting to see how this space grows and how the tools are adopted.
A bright future would see data preparation and data quality as first-class citizens in the data workflow, alongside machine learning, deep learning, and AI. Dealing with incorrect or missing data is unglamorous but necessary work.
It’s easy to justify working with data that’s obviously wrong; the only real surprise is the amount of time it takes. Understanding how to manage more subtle problems with data, such as data that reflects and perpetuates historical biases (for example, real estate redlining), is a more difficult organizational challenge.
This will require honest, open conversations in any organization around what data workflows actually look like.
The fact that business leaders are focused on predictive models and deep learning while data workers spend most of their time on data preparation is a cultural challenge, not a technical one. If this part of the data flow pipeline is going to be solved in the future, everybody needs to acknowledge and understand the challenge.
Original Source: The unreasonable importance of data preparation
source https://www.cashadvancepaydayloansonline.com/the-unreasonable-importance-of-data-preparation-in-2020/?utm_source=rss&utm_medium=rss&utm_campaign=the-unreasonable-importance-of-data-preparation-in-2020
0 notes
Text
Bell's Theorem Versus Local Realism in a Quaternionic Model of Physical Space. (arXiv:1405.2355v12 [quant-ph] UPDATED)
In the context of EPR-Bohm type experiments and spin detections confined to spacelike hypersurfaces, a local, deterministic and realistic model within a Friedmann-Robertson-Walker spacetime with a constant spatial curvature (S^3) is presented that describes simultaneous measurements of the spins of two fermions emerging in a singlet state from the decay of a spinless boson. Exact agreement with the probabilistic predictions of quantum theory is achieved in the model without data rejection, remote contextuality, superdeterminism or backward causation. A singularity-free Clifford-algebraic representation of S^3 with vanishing spatial curvature and non-vanishing torsion is then employed to transform the model in a more elegant form. Several event-by-event numerical simulations of the model are presented, which confirm our analytical results with the accuracy of 4 parts in 10^4. Possible implications of our results for practical applications such as quantum security protocols and quantum computing are briefly discussed.
from gr-qc updates on arXiv.org https://ift.tt/1jln2HB
0 notes
Text
Avengers Endgame and Time Travel
Okay, I’ve been doing some research and I’ve come to the conclusion that the Russo brothers were using the Block Universe Theory OR Closed Timelike Curves to explain how time itself would work in Endgame. Disclaimer: I am no scientist, just a confused fan trying to figure this shit out.
It’s a long post, so click at your own risk :)
For Block Universe Theory, essentially past, present, and future are all occurring simultaneously. Even by “traveling to the past” you cannot change it by existing in it, for that would be a contradiction of the already existing “future.” But considering these are all one and the same, your travel to a certain moment in spacetime doesn’t affect the relativity of your actions. What is past for you will be the future for you as well, as this occurrence always has been and always will be in existence.
Confused? Yeah, so are most people (myself included) when they try to learn about this theory. This wouldn’t be the first time you’ve seen it represented in movies though - another well-known, similar example is Interstellar. The scene where Matthew McConaughey’s character enters a 4th/5th dimension (some argue characteristics applied to our 3rd dimension, such as weight or time, count as a 4th dimension and thus he is in a 5th dimension, but I personally don’t agree) shows spacetime as represented by “blocks”. This is a good stab at a visual representation of all time events occurring simultaneously across our existence. He is able to speak with his daughter “in the past” while existing “in the future”.
I feel like this representation isn’t 100% accurate since it solely focuses on one point in time (the bookshelf) rather than all moments in time simultaneously, but eh. Semantics.
Okay, so the second theory is Closed Timelike Curves (CTC), and honestly I think it’s more likely what the Russos had in mind, although they may have borrowed some from Block Universe Theory. Not only does this theory deal with quantum mechanics/physics (hello, Quantum Realm used for time travel), it uses more accepted theories relating time travel to the bending of gravity and essentially the creation of a wormhole. It is notable that this theory shows “paradoxes created by CTCs could be avoided at the quantum scale because of the behavior of fundamental particles, which follow only the fuzzy rules of probability rather than strict determinism.” Listen, if I’m still confusing, the link at the top of this post explains CTC very well - go click that.
Ah, great, so this makes much more sense in Endgame in that we’re going off probability rather than strict causation/reaction science. Just having that, say, scientific wiggle room allows for this line, said by Banner after Rhodey suggests killing baby Thanos:
“Time doesn’t work that way. Changing the past doesn’t change the future... if you travel to the past, that past becomes your future. And your former present becomes the past. Which can’t now be changed by your new future.”
Same, Rhodey, same. I... think Banner is just wrong in his explanation? Did he invent time travel? NO. Tony did. So sorry if I don’t take Bruce’s word as law on this. He even says earlier in the movie that “time travel do-over isn’t my area of expertise.” I much prefer the CTC Theory that says basically all of the Avengers existed with a certain degree of probability of going back in time to take the infinity stones and bring them to present day. Nat had a probability of going back in time and dying. These events weren’t certain but their possibility was accounted for upon their particles’ creation in the realm of spacetime. The linked article has a similar example:
“Instead of a human being traversing a CTC to kill her ancestor, imagine that a fundamental particle goes back in time to flip a switch on the particle-generating machine that created it. If the particle flips the switch, the machine emits a particle—the particle—back into the CTC; if the switch isn't flipped, the machine emits nothing. In this scenario there is no a priori deterministic certainty to the particle's emission, only a distribution of probabilities. Deutsch's insight was to postulate self-consistency in the quantum realm, to insist that any particle entering one end of a CTC must emerge at the other end with identical properties. Therefore, a particle emitted by the machine with a probability of one half would enter the CTC and come out the other end to flip the switch with a probability of one half, imbuing itself at birth with a probability of one half of going back to flip the switch. If the particle were a person, she would be born with a one-half probability of killing her grandfather, giving her grandfather a one-half probability of escaping death at her hands—good enough in probabilistic terms to close the causative loop and escape the paradox. Strange though it may be, this solution is in keeping with the known laws of quantum mechanics.”
Alright, if you’ve read this far without mentally telling me, the confused messenger, to fuck off you get a gold star.
So Tony talks about an inverted Möbius strip as his diagram for accurate time travel, and I have to say it would resemble a CTC much more than a Block Universe or a linear timeline. This would show that, hey, when we travel we aren’t messing with the past or future because our actions have essentially been accounted for already. (I think?) This is why I didn’t like the Supreme Sorceress’ visual of how spacetime works, because it looks like a linear timeline/reality, which goes against everything else the movie has set up.
I believe you could have rips in spacetime that cause an alternate reality, just like the above CTC shows a tear, but I’m not so sure if it would send you to another reality since your divergence would’ve been accounted for. This explains everything except where Loki may have gone, but he had an infinity stone so that could mean anything.
Alright I’m tired and over talking about time theories. Does any of this even make sense? I NEED SHURI TO EXPLAIN. I think the answer is we don’t know and it all isn’t going to make sense. Why? Cause if you could give me an infallible time travel theory WE WOULD BE TIME TRAVELING IRL.
So just... enjoy cry about the movie or whatevs.
Bonus:
Another time theory is the Sapir-Whorf hypothesis, which “holds that our perception of reality is either altered or determined by the language we speak.” Arrival - the awesome alien movie starring Amy Adams and Jeremy Renner showcases it. It’s been a while since I’ve watched it, but I believe by speaking to the aliens and understanding their perception of communication, Amy’s character also develops an understanding for time relativity by learning the aliens’ language and “time travels” to the “future” to affect the decision of another character. This is described as backward causation however, not inherently time travel related, though similar. Instead of Event A occurring first and causing Event B, Event B, occurring second is the causation for Event A. The link explains it with examples.
#avengers endgame#endgame spoilers#a4 spoilers#avengers 4#mcu#marvel#time travel#time travel fiasco#block universe theory#closed timelike curves#bad science#tony stark#bruce banner#quantum realm#mine
0 notes
Text
Evidence-Based Satire
By SAURABH JHA
Sequels generally disappoint. Jason couldn’t match the fear he generated in the original Friday the 13th. The sequel to the Parachute, a satirical piece canvassing PubMed for randomized controlled trials (RCTs) comparing parachutes to placebo, matched its brilliance, and even exceeded it, though the margin can’t be confirmed with statistical significance. The Parachute, published in BMJ’s Christmas edition, will go down in history with Jonathan Swift’s Modest Proposal and Frederic Bastiat’s Candlemakers’ Petition as timeless satire in which pedagogy punched above, indeed depended on, their absurdity.
In the Parachute, researchers concluded, deadpan, that since no RCT has tested the efficacy of parachutes when jumping off a plane, there is insufficient evidence to recommend them. At first glance, the joke was on RCTs and those who have an unmoored zeal for them. But that’d be a satirical conclusion. Sure, some want RCTs for everything, for whom absence of evidence means no evidence. But that’s because of a bigger problem, which is that we refuse to acknowledge that causality has degrees, shades of gray, yet causality can sometimes be black and white. Some things are self-evident.
In medicine, causation, even when it’s not correlation, is often probabilistic. Even the dreaded cerebral malaria doesn’t kill everyone. If you jump from a plane at 10,000 feet without a parachute, death isn’t probabilistic, it is certain. And we know this despite the absence of rigorous empiricism. It’s common sense. We need sound science to tease apart probabilities, and the grayer the causality, the sounder the empiricism must be to accord the treatment its correct quantitative benefit, the apotheosis of this sound science being an RCT. When empiricism ventures into certainties, it’s no longer sound science. It is parody.
If the femoral artery is nicked and blood spurts to the ceiling more forcefully than Bellagio’s fountains you don’t need an RCT to make a case for stopping the bleeding, even though all bleeding stops, eventually. But you do need an RCT if you’re testing which of the fine sutures at your disposal is better at sewing the femoral artery. The key point is the treatment effect – the mere act of stopping the bleed is a parachute, a huge treatment effect, which’d be idiotic to test in an RCT. Improving on the high treatment effect, even or particularly modestly, needs an RCT. The history of medicine is the history of parachutes and finer parachutes. RCTs became important when newer parachutes allegedly became better than their predecessors.
The point of the parachute satire is that the obvious doesn’t need empirical evidence. It is a joke on non-judgmentalism, or egalitarianism of judgment, on the objectively sincere but willfully naïve null hypothesis where all things remain equally possible until we have data.
There has been no RCT showing that cleaning one’s posterior after expulsion of detritus improves outcomes over placebo. This is our daily parachute. Yet some in the east may justifiably protest the superiority of the Occidental method of cleaning over their method of using hand and water without a well-designed RCT. Okay, that’s too much information. Plus, I’m unsure such an RCT would even be feasible, as the crossover rate would be so high that no propensity matching will adjust for the intention to wipe, but you get my drift.
The original Parachute satire is now folklore with an impressive H-index to boot. That it has been cited over a thousand times is also satirical – the joke is on the H-index, a seriously flawed metric which is taken very seriously by serious academics. But it also means that to get a joke into a peer-reviewed publication you need to have a citation for your joke! The joke is also on the criminally unfunny Reviewer 2.
The problem with the parachute metaphor is that many physicians want their pet treatment, believing it to be a parachute, to be exempt from an RCT. This, too, is a consequence of non-judgmentalism, a scientific relativism where every shade of gray thinks it is black and white. One physician’s parachute is another physician’s umbrella. This is partly a result of the problem RCTs are trying to solve – treatment effects are probabilistic and when the added margins are so small, parachutes become difficult to disprove with certainty. You can’t rule out a parachute.
Patient: Was it God who I should thank for saving me from cardiogenic shock?
Cardiologist: In hindsight, I think it was a parachute.
Patient: Does this parachute have a name?
Cardiologist: We call it Impella.
Patient: Praise be to the Impella.
Cardiologist: Wait, it may have been the Swan Ganz catheter. Perhaps two parachutes saved you. Or maybe three, if we include Crestor.
The problem with RCTs is agreeing on equipoise – a state of genuine uncertainty that an intervention has net benefits. Equipoise is a tricky beast which exposes the parachute problem. If two dogmatic cardiac imagers are both certain that cardiac CT and SPECT, respectively, are the best first line test for suspected ischemia, then there’s equipoise. That they’re both certain about their respective modality doesn’t lessen the equipoise. That they disagree so vehemently with each other merely confirms equipoise. The key point is that when one physician thinks an intervention is a parachute and the other believes it’s an umbrella, there’s equipoise.
Equipoise, a zone of maximum uncertainty, is a war zone. We disagree most passionately about smallest effect sizes. No one argues about the efficacy of parachutes. To do an RCT you need consensus that there is equipoise. But the first rule of equipoise is that some believe there’s no equipoise – this is the crux of the tension. You can’t recruit cardiac imagers to a multi-center RCT comparing cardiac CT to SPECT if they believe SPECT is a parachute.
Consensus inevitably drifts to the lowest common denominator. As an example, when my family plans to eat out there’s fierce disagreement between my wife – who likes the finer taste of French cuisine, my kids – whose Americanized palates favor pizza, and me – my Neanderthalic palate craves goat curry. We argue and then we end up eating rice and lentils at home. Consensus is an equal-opportunity spoilsport.
Equipoise has become bland and RCTs, instead of being daring, often recruit the lowest-risk patients for an intervention. RCTs have become contrived showrooms with the generalizability of Potemkin villages. Parachute’s sequel was a multi-center RCT in which people jumping from an aircraft were randomized to parachutes or backpacks. There was no crossover. Protocol violation was nil, but there was a cheeky catch. The aircraft was on the ground. Thus, the first RCT of parachutes, powered to make us laugh, was a null trial.
Point taken. But what was their point? Simply put, parachutes are useless if not needed. The pedagogy delivered was resounding precisely because of the absurdity of the trial. If you want to generalize an RCT you must choose the right patients, sick patients, patients on whom you’d actually use the treatment you’re testing. You must get your equipoise right. That was their point, made brilliantly. The joke wasn’t on RCTs; the joke was on equipoise. Equipoise is now the safest of safe spaces; joke-phobic college millennials would be envious. Equipoise is bollux.
The “Parachute Returns” satire had a mixed reception with audible consternation in some quarters. Though it may just be me and, admittedly, I find making Germans laugh easier than Americans, I was surprised by the provenance of the researchers, who hailed from Boston, better known for serious quantitative social engineers than stand-up quantitative comedians. Satire is best when it mocks your biases.
The quantitative sciences have become parody even, or particularly, when they don’t intend satire. An endlessly cited study concluded that medical errors are the third leading cause of death. The researchers estimated the national burden of medical errors from a mere thirty-five patients; it was the empirical version of feeding the multitude – the New Testament story of feeding the 5,000 from five loaves and two fish. How can one take such researchers seriously? I couldn’t. I had no rebuttal except satire.
In the age of unprecedented data-driven rationalism satire keeps judgment alive. To be fair, the statisticians, the gatekeepers of the quantitative sciences, have a stronger handle on satire than doctors. The Gaussian distribution has in-built absurdity. For example, because height follows a normal distribution, and the tails of the bell-shaped curve go on and on, a quantitative purist may conclude there’s a non-zero chance that an adult can be taller than a street light; it’s our judgment which says that this isn’t just improbable but impossible. Gauss might have pleaded – don’t take me literally, I mean statistically, I’m only an approximation.
A statistician once showed that the myth of storks delivering babies can’t empirically be falsified. There is, indeed, a correlation in Europe between live births and storks. The correlation coefficient was 0.62 with a p-value of 0.008. Radiologists would love to have that degree of correlation with each other when reading chest radiographs. The joke wasn’t on storks but simple linear regression, and for all the “correlation isn’t causation” wisdom, the pedagogic value of “stork deliver babies” is priceless.
If faith began where our scientific understanding ended, satire marks the boundaries of statistical certainty. Satire marks no-go areas where judgment still reigns supreme; a real estate larger than many believe. The irony of uncertainty is that we’re most uncertain of the true nature of treatment differences when the differences are the smallest. It’s easy seeing that Everest is taller than Matterhorn. But it takes more sophisticated measuring to confirm that Lhotse is taller than Makalu. The sophistication required of the quantitative sciences is inversely proportional to the effect size it seeks to prove. It’s as if mathematics is asking us to take a chill pill.
The penumbra of uncertainty is an eternal flame. Though the conventional wisdom is that a large enough sample size can douse uncertainty, even large n’s create problems. The renowned psychologist and uber-researcher Paul Meehl conjectured that as the sample size approaches infinity, there’s a 50% chance that we’ll reject the null hypothesis when we shouldn’t. With large sample sizes everything becomes statistically significant. Small n increases uncertainty and large n increases irrelevance. What a poetic trade-off! If psychology research has reproducibility problems, epidemiology is one giant shruggie.
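Meehl’s point about large samples is easy to demonstrate. The sketch below (an illustration with arbitrary numbers, not a reconstruction of Meehl’s argument) simulates two groups whose true means differ by a practically meaningless 0.01 and shows the p-value collapsing as n grows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_difference = 0.01   # a practically negligible gap between the two groups

for n in [100, 10_000, 1_000_000]:
    group_a = rng.normal(loc=0.0, scale=1.0, size=n)
    group_b = rng.normal(loc=true_difference, scale=1.0, size=n)
    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    print(f"n={n:>9,}  p-value={p_value:.3g}")

# As n grows, the trivial 0.01 difference eventually becomes "statistically
# significant" long before it becomes interesting: small n breeds uncertainty,
# large n breeds irrelevance.
```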
When our endeavors become too big for their boots satire rears its absurd head. Satire is our check and balance. We’re trying to get too much out of the quantitative sciences. Satire marks the territory empiricism should stay clear of. If empiricism befriended satire it could be even greater, because satire keeps us humble.
The absurd coexists with the serious and like pigs and farmers resembling each other in the closing scene of Animal Farm, it’s no longer possible to tell apart the deservedly serious from the blithering nonsense. And that’s why we need satire more than ever.
Congratulations to the BMJ for keeping satire alive.
Merry Christmas.
About the Author
Saurabh Jha is a frequent author of satire, and sometimes its subject. He can be reached on Twitter @RogueRad
Evidence-Based Satire published first on https://wittooth.tumblr.com/
0 notes
Text
Walen on Probabilistic Partial Causation
Tuesday, September 11, 2018
Cases in which a person merely contributes to the probability of harmful events such as death cannot be properly handled in the criminal law as it currently exists. These cases provide another kind of case (in addition to overdetermination cases) demonstrating that factual causation should not be understood in terms of but-for causation. They also imply that the corpus of substantive law is incomplete.
http://lawprofessors.typepad.com/crimprof_blog/2018/09/walen-on-probabilistic-partial-causation.html
Walen on Probabilistic Partial Causation republished via CrimProf Blog
0 notes